Word Translation Prediction for Morphologically Rich Languages with Bilingual Neural Networks

Authors

  • Ke M. Tran
  • Arianna Bisazza
  • Christof Monz
Abstract

Our approach enables:

  • accurate prediction of the target translation stem and suffix given a fixed amount of context
  • automatic learning of relevant features with a neural network architecture

Choosing the correct surface form requires linguistic features of the source and target context:

  • in phrase-based SMT, access to source context depends on phrase segmentation
  • linguistic features depend on available annotation tools and manual feature engineering
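As a rough illustration of what "a fixed amount of context" could look like as network input, the sketch below builds a fixed-length bilingual context: a window of source words around an aligned source position, followed by the most recent target words. This is not taken from the paper; the function names, the padding symbol, the window sizes, and the example tokens are all assumptions made for illustration.

```python
PAD = "<pad>"  # hypothetical padding symbol for positions outside the sentence


def source_window(source, center, radius):
    """Return 2*radius + 1 source tokens centered on `center`, padded at the edges."""
    return [
        source[i] if 0 <= i < len(source) else PAD
        for i in range(center - radius, center + radius + 1)
    ]


def target_history(target_prefix, n):
    """Return the last n target tokens, left-padded so the result always has length n."""
    history = target_prefix[-n:] if n > 0 else []
    return [PAD] * (n - len(history)) + history


def bilingual_context(source, center, radius, target_prefix, n):
    """Concatenate the source window and target history into one fixed-length context."""
    return source_window(source, center, radius) + target_history(target_prefix, n)


# Assumed example: the source word "house" (index 2) is being translated,
# and one target word has been produced so far.
source = ["the", "old", "house"]
context = bilingual_context(source, center=2, radius=1, target_prefix=["staryj"], n=2)
print(context)
```

Whatever the sentence length, the context always has 2 * radius + n + 1 tokens, so it can feed a network with a fixed-size input layer that would then score candidate stems and suffixes.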

Similar resources

Word Representation Models for Morphologically Rich Languages in Neural Machine Translation

Dealing with the complex word forms in morphologically rich languages is an open problem in language processing, and is particularly important in translation. In contrast to most modern neural systems of translation, which discard the identity for rare words, in this paper we propose several architectures for learning word representations from character and morpheme level word decompositions. ...

A Distributed Inflection Model for Translating into Morphologically Rich Languages

Lexical sparsity is a major challenge for machine translation into morphologically rich languages. We address this problem by modeling sequences of fine-grained morphological tags in a bilingual context. To overcome the issue of ambiguous word analyses, we introduce soft tags, which are under-specified representations retaining all possible morphological attributes of a word. In order to learn ...

Learning Bilingual Phrase Representations with Recurrent Neural Networks

We introduce a novel method for bilingual phrase representation with Recurrent Neural Networks (RNNs), which transforms a sequence of word feature vectors into a fixed-length phrase vector across two languages. Our method measures the difference between the vectors of source- and target-side phrases, and can be used to predict the semantic equivalence of source and target word sequences in the ph...

Providing Morphological Information for SMT Using Neural Networks

Treating morphologically complex words (MCWs) as atomic units in translation would not yield a desirable result. Such words are complicated constituents with meaningful subunits. A complex word in a morphologically rich language (MRL) could be associated with a number of words or even a full sentence in a simpler language, which means the surface form of complex words should be accompanied with...

Induction of Fine-Grained Part-of-Speech Taggers via Classifier Combination and Crosslingual Projection

This paper presents an original approach to part-of-speech tagging of fine-grained features (such as case, aspect, and adjective person/number) in languages such as English where these properties are generally not morphologically marked. The goals of such rich lexical tagging in English are to provide additional features for word alignment models in bilingual corpora (for statistical machine tr...

Publication date: 2014